Remarks

The present amendment replies to the Official Action mailed August 3, 2001. That action
objected to the specification as informal. Claims 1-3, 9, 10, and 38-46 were rejected under 35 
U.S.C. 103(a) over Herrell et al. U.S. Patent No. 5,301,287. Claims 4-7 and 11-13 were rejected
under 35 U.S.C. 103(a) over Herrell in view of McLellan et al. U.S. Patent No. 5,890,201. Claims
8, 14, and 15 were objected to as being dependent upon a rejected base claim, but were indicated to 
be allowable if rewritten in independent form including all limitations of the base claim and any 
intervening claims. Each of the points raised by the Official Action is addressed below following a 
brief discussion of the present invention to provide context. 

Claims 1 and 42 have been amended to be more clear and distinct. Claims 16-37 have been 
previously cancelled. Claims 1-15 and 38-46 are presently pending. Attached hereto is a marked- 
up version showing the changes made to the specification and claims by the current amendment. 
The attached pages are captioned "Version with Markings to Show Changes Made."

The Present Invention 

The present invention relates generally to improvements in array processing, and more 
particularly to advantageous techniques for providing improved mechanisms of data distribution 
to, and collection from multiple memories often associated with and local to processing elements 
within an array processor. 

Various prior art techniques exist for the transfer of data between system memories or 
between system memories and I/O devices. Fig. 1 of the present application shows a 
conventional data processing system 100 comprising a host uniprocessor 110, processor local
memory 120, direct memory access (DMA) controller 160, system memory 150 which is usually
a larger memory store than the processor local memory, having longer access latency, and
input/output (I/O) devices 130 and 140. 

The DMA controller 160 provides a mechanism for transferring data between processor 
local memory and system memory or I/O devices concurrent with uniprocessor execution. DMA 
controllers are sometimes referred to as I/O processors or transfer processors in the literature. 
System performance is improved since the host uniprocessor can perform computations while the 
DMA controller is transferring new input data to the processor local memory and transferring 
result data to output devices or the system memory. A data transfer is typically specified with 
the following minimum set of parameters: source address, destination address, and number of 
data elements to transfer. Addresses are interpreted by the system hardware and uniquely 
specify I/O devices or memory locations from which data must be read or to which data must be 
written. Sometimes additional parameters are provided such as element size. 

One of the limitations of conventional DMA controllers is that address generation 
capabilities for the data source and data destination are often constrained to be the same. For 
example, when only a source address, destination address and a transfer count are specified, the 
implied data access pattern is block-oriented, that is, a sequence of data words from contiguous 
addresses starting with the source address is copied to a sequence of contiguous addresses 
starting at the destination address. 
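
The block-oriented access pattern described above can be illustrated with a minimal sketch. The function name `block_transfer` and the modeling of memory as a flat Python list are assumptions made purely for illustration; they are not drawn from the application or from any cited reference.

```python
def block_transfer(memory, src, dst, count):
    """Copy `count` words from contiguous addresses starting at `src`
    to contiguous addresses starting at `dst` (hypothetical model)."""
    for i in range(count):
        # Source and destination addresses advance in lockstep, so both
        # sides are constrained to the same contiguous pattern.
        memory[dst + i] = memory[src + i]
    return memory


# Assumed example: 16-word memory, copy 4 words from address 2 to address 8.
mem = list(range(8)) + [0] * 8
block_transfer(mem, src=2, dst=8, count=4)
```

With only these three parameters there is no way to express, for example, a strided read on the source side paired with a contiguous write on the destination side, which is the limitation the paragraph above identifies.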

Array processing presents challenges for data collection and distribution both in terms of 
addressing flexibility, control and performance. The patterns in which data elements are 
distributed and collected from processing element local memories can significantly affect the 
overall performance of the processing system. With the advent of the ManArray architecture it 
has been recognized that it will be advantageous to have improved techniques for data transfer 
which provide these capabilities and which are tailored to this new architecture. 

The present invention addresses a variety of advantageous methods and apparatus for
improved data transfer control within a data processing system. In particular, improved 
techniques are provided for: distributing data to, and collecting data from an array of processing 
elements (PEs) in a flexible and efficient manner; and PE address translation which allows data 
distribution and collection based on PE virtual IDs. 

Further aspects of the present invention are related to a virtual-to-physical PE ID 
translation which works together with a ManArray PE interconnection topology to support a 
variety of communication models (such as hypercube and mesh) through data placement based 
upon a PE virtual ID. This result can be accomplished in a DMA controller by translation, 
through a VID-to-PID lookup table or through combinational logic, where the resulting PID 
becomes an addressing component on the DMA bus to PE local memories. This result can also 
be achieved at the PE local memories within the interface logic, where a VID available to the
interface logic is compared to a VID presented on the DMA bus. A match at a particular 
memory interface allows that memory to accept the access. 
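
The two translation approaches described above may be sketched as follows. This is an illustrative model only: the table contents, the four-PE array size, and the names `VID_TO_PID`, `translate_in_dma`, and `PEMemoryInterface` are assumptions and do not appear in the application.

```python
# Assumed mapping for a hypothetical four-PE array:
# the list index is the VID, the entry is the PID.
VID_TO_PID = [0, 1, 3, 2]


def translate_in_dma(vid):
    """Approach 1: the DMA controller looks up the PID in a table and
    uses it as an addressing component on the DMA bus."""
    return VID_TO_PID[vid]


class PEMemoryInterface:
    """Approach 2: each PE memory interface knows its own VID and accepts
    an access only when the VID presented on the bus matches it."""

    def __init__(self, vid):
        self.vid = vid
        self.local_memory = {}

    def on_bus_write(self, bus_vid, address, data):
        if bus_vid != self.vid:
            return False  # access is not addressed to this PE
        self.local_memory[address] = data
        return True


# One write presented on a shared bus; only the PE whose VID is 2 accepts it.
pes = [PEMemoryInterface(vid) for vid in range(4)]
accepted = [pe.on_bus_write(bus_vid=2, address=0x10, data=99) for pe in pes]
```

In the first approach the translation is centralized in the DMA controller; in the second, the comparison is distributed across the memory interfaces and no table is needed in the controller.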

Objection to Specification 

The specification has been amended to address the informality objection. 

35 U.S.C. 103 Rejections 

The claims as presently amended make it clear that the present invention addresses 
techniques for improved array processing in which mechanisms are provided for effective data 
distribution to and collection from multiple memories associated with and local to processing 
elements within the array processor. 

By contrast, Herrell relates to aspects of an entirely different problem. Herrell states that it
"relates to a method and apparatus for providing direct access by an external data processing system 
to data stored in the main memory of a host system, and more particularly, to an interface method 
and apparatus for providing direct memory access by an external data processing system, such as a 
graphics subsystem, to virtual memory of the host system by transferring the contents of main 
memory of the host system at a location in virtual memory space specified by the user to the 
external data processing system under the user's control." Col. 1, lines 10-20. Reconsideration and 
withdrawal of the present rejection are respectfully requested. 



Conclusion

All of the claims standing in order for allowance, this case should be promptly allowed.
Should there be any issues which might be expedited by a telephone call, the Examiner is
requested to call the undersigned at the number below.



Respectfully submitted, 




Peter H. Priest
Reg. No. 30,210 



Priest & Goldstein, PLLC 
529 Dogwood Drive 
Chapel Hill, NC 27516 
(919) 942-1434 

VERSION WITH MARKINGS TO SHOW CHANGES MADE

In the Specification 

Please change the first two sentences of the present application as follows: 
[This] The present application is a division of [application] U.S. Application Serial Number
09/472,372 filed on December 23, 1999, now U.S. Patent No. 6,256,683, which in turn claimed [.
The present application claims] the benefit of U.S. Provisional Application Serial No. 60/113,637
entitled "Methods and Apparatus for Providing Direct Memory Access (DMA) Engine" and filed
December 23, 1998 which is incorporated by reference in its entirety herein.

Please replace the paragraph beginning at page 6, line 1 and extending to page 7, line 19 as 
follows: 

Further details of a presently preferred ManArray core, architecture, and instructions for 
use in conjunction with the present invention are found in U.S. Patent Application Serial No. 
08/885,310 filed June 30, 1997, now U.S. Patent No. 6,023,753, U.S. Patent Application Serial
No. 08/949,122 filed October 10, 1997, now U.S. Patent No. 6,167,502, U.S. Patent Application
Serial No. 09/169,255 filed October 9, 1998, U.S. Patent Application Serial No. 09/169,256 filed
October 9, 1998, now U.S. Patent No. 6,167,501, U.S. Patent Application Serial No. 09/169,072
filed October 9, 1998, now U.S. Patent No. 6,219,776, U.S. Patent Application Serial No.
09/187,539 filed November 6, 1998, now U.S. Patent No. 6,151,668, U.S. Patent Application
Serial No. 09/205,558 filed December 4, 1998, now U.S. Patent No. 6,173,389, U.S. Patent
Application Serial No. 09/215,081 filed December 18, 1998, now U.S. Patent No. 6,101,592,
U.S. Patent Application Serial No. 09/228,374 filed January 12, 1999, now U.S. Patent No.
6,216,223 [and entitled "Methods and Apparatus to Dynamically Reconfigure the Instruction
Pipeline of an Indirect Very Long Instruction Word Scalable Processor"], U.S. Patent
Application Serial No. 09/238,446 filed January 28, 1999, U.S. Patent Application Serial No. 
09/267,570 filed March 12, 1999, U.S. Patent Application Serial No. 09/337,839 filed June 22, 
1999, U.S. Patent Application Serial No. 09/350,191 filed July 9, 1999, U.S. Patent Application 
Serial No. 09/422,015 filed October 21, 1999 [entitled "Methods and Apparatus for Abbreviated 
Instruction and Configurable Processor Architecture"], U.S. Patent Application Serial No. 
09/432,705 filed November 2, 1999 [entitled "Methods and Apparatus for Improved Motion
Estimation for Video Encoding"], U.S. Patent Application Serial No. [ ]
09/471,217 filed December 23, 1999, now U.S. Patent No. 6,260,082 [entitled "Methods and
Apparatus for Providing Data Transfer Control"], as well as, [Provisional Application Serial No. 
60/113,637 entitled "Methods and Apparatus for Providing Direct Memory Access (DMA) 
Engine" filed December 23, 1998, Provisional Application Serial No. 60/113,555 entitled 
"Methods and Apparatus Providing Transfer Control" filed December 23, 1998,] Provisional
Application Serial No. 60/139,946 entitled "Methods and Apparatus for Data Dependent Address 
Operations and Efficient Variable Length Code Decoding in a VLIW Processor" filed June 18, 
1999, Provisional Application Serial No. 60/140,245 entitled "Methods and Apparatus for 
Generalized Event Detection and Action Specification in a Processor" filed June 21, 1999, 
Provisional Application Serial No. 60/140,163 entitled "Methods and Apparatus for Improved 
Efficiency in Pipeline Simulation and Emulation" filed June 21, 1999, Provisional Application 
Serial No. 60/140,162 entitled "Methods and Apparatus for Initiating and Re-Synchronizing 
Multi-Cycle SIMD Instructions" filed June 21, 1999, Provisional Application Serial No. 
60/140,244 entitled "Methods and Apparatus for Providing One-By-One Manifold Array (1x1 
ManArray) Program Context Control" filed June 21, 1999, Provisional Application Serial No. 
60/140,325 entitled "Methods and Apparatus for Establishing Port Priority Function in a VLIW 
Processor" filed June 21, 1999, Provisional Application Serial No. 60/140,425 entitled "Methods
and Apparatus for Parallel Processing Utilizing a Manifold Array (ManArray) Architecture and
Instruction Syntax" filed June 22, 1999, Provisional Application Serial No. 60/165,337 entitled
"Efficient Cosine Transform Implementations on the ManArray Architecture" filed November
12, 1999, and Provisional Application Serial No. [ ] 60/171,911 entitled
"Methods and Apparatus for [DMA] Loading of Very Long Instruction Word Memory" filed
December 23, 1999, respectively, all of which are assigned to the assignee of the present 
invention and incorporated by reference herein in their entirety. 

Please replace the paragraph at page 12, lines 1-12 as follows:

Each transfer controller within a ManArray DMA controller is designed to fetch its own 
stream of DMA instructions. DMA instructions are of five basic types: transfer; branch; load; 
synchronization; and state control. The branch, load, synchronization, and state control types of 
instructions are collectively referred to as "control instructions", and distinguished from the 
transfer instructions which actually perform data transfers. DMA instructions are typically of 
multi-word length and require a variable number of cycles to execute although several control 
instructions require only a single word to specify. Although the presently preferred embodiment 
supports multiple DMA instruction types as described in further detail in U.S. Patent Application
Serial No. [ entitled "Methods and Apparatus for Providing Data Transfer
Control"] 09/471,217 filed December 23, 1999, now U.S. Patent No. 6,260,082, and incorporated
by reference in its entirety herein, the present invention focuses on instructions and mechanisms 
which provide for flexible and efficient data transfers to and from multiple memories. 

Please replace the paragraph at page 20, lines 13-21 as follows:

The following aspects of the loop formulation are noted. When the requested number of 
accesses are made (TC in Figs. 10-12) then all loops are exited immediately, leaving all address 
and loop control variables in their current states. By using logical "while" loops and 
reinitializing a loop only at its exit, it is possible to reenter the loops and continue a transfer after 
"terminal count" (TC) addresses have been accessed. This capability is used in this invention to 
allow transfers to be restarted so that the addressing continues as though it would if the transfer 
count had not been exhausted. For further details of such transfers see U.S. Application Serial
No. [ ] 09/471,217 filed December 23, 1999 [entitled "Methods and Apparatus
for Providing Data Transfer Control"], now U.S. Patent No. 6,260,082, which is incorporated by
reference in its entirety herein. 
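
For illustration only, the restartable loop behavior described in the paragraph above may be sketched as follows. The two-level loop nest, the class name `LoopedAddressGenerator`, and its parameters are assumptions introduced for this sketch; the loop formulations actually used are those of Figs. 10-12, which are not reproduced here. The sketch assumes `inner_count` is at least 1.

```python
class LoopedAddressGenerator:
    """Hypothetical two-level address loop whose state survives terminal count."""

    def __init__(self, base, inner_count, inner_stride, outer_stride):
        self.base = base
        self.inner_count = inner_count
        self.inner_stride = inner_stride
        self.outer_stride = outer_stride
        self.outer = 0  # outer loop variable, preserved between runs
        self.inner = 0  # inner loop variable, preserved between runs

    def run(self, tc):
        """Generate `tc` addresses, then exit all loops immediately,
        leaving the loop variables in their current states."""
        addresses = []
        while True:  # outer "while" loop (unbounded in this sketch)
            while self.inner < self.inner_count:
                if len(addresses) == tc:
                    return addresses  # terminal count reached; state kept
                addresses.append(self.base
                                 + self.outer * self.outer_stride
                                 + self.inner * self.inner_stride)
                self.inner += 1
            self.inner = 0  # reinitialize the inner loop only at its exit
            self.outer += 1


# Two restarted transfers of 3 accesses each versus one transfer of 6.
gen = LoopedAddressGenerator(base=0, inner_count=4, inner_stride=1, outer_stride=10)
first = gen.run(3)
second = gen.run(3)
uninterrupted = LoopedAddressGenerator(0, 4, 1, 10).run(6)
```

Because the loop variables are reinitialized only when a loop exits normally, the restarted transfer resumes at the address that would have come next had the transfer count not been exhausted.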

In the Claims 

1. (Amended) An apparatus for performing virtual identification (VID) to
physical identification (PID) translation for data elements to be accessed within local memory of
a processing element (PE) whereby a direct memory access (DMA) controller can access PE
local memories according to their VIDs, the apparatus comprising: 

an array of multiple PEs each having local PE memory; 

a DMA controller; and 

a memory maintained in the DMA controller for storing a processing element VID-to-PID
table mapping processing element VIDs to processing element PIDs utilized by the DMA
controller to access local memories according to their VIDs.

42. (Amended) A processing apparatus comprising: 

a plurality of processing elements (PEs) communicatively connected by a bus, each PE
comprising a register storing a virtual identification number (VID) identifying the PE; and 

a direct memory access (DMA) controller connected to the bus for accessing local data 
memory of the PEs, each data access at least partially identified by a VID; 

wherein during a common data access to multiple PEs, a PE responds to the data access if 
the VID stored in the register matches the VID of the data access. 